On Multiplicative Entropy and Information Gain in Large Data Sets

Authors

  • Udayan Ghose
  • YOGESH SINGH
Abstract

Information theory is one of the most widely used branches of applied probability theory. Using probability to describe the state of a system implies that the state carries some uncertainty. Probability distributions are not all equal in this respect: some indicate more uncertainty than others. One can therefore define a mathematical quantity that takes a probability distribution as input and returns a measure of its uncertainty. It has been observed that the mutual information between two variables is the reduction in uncertainty about one variable due to knowledge of the other. This paper takes a new approach to examining the multiplicative nature of entropy and conditional entropy. On this basis, information gain is calculated on a large data set consisting of scanned OMR application forms of candidates applying to the engineering courses of a university. A simulation is run on these data, and information gain is calculated using a set of predefined parameters.
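To make the quantities named in the abstract concrete, the following is a minimal sketch of entropy, conditional entropy, and information gain. It assumes the standard Shannon definitions, with information gain taken as H(Y) - H(Y | X); it does not reproduce the paper's multiplicative formulation, which the abstract does not spell out. The function names and the toy records (`category`, `branch`) are hypothetical and are not taken from the OMR data set.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(targets, features):
    """H(target | feature): per-group entropy of the targets, weighted by group size."""
    total = len(targets)
    groups = {}
    for feature, target in zip(features, targets):
        groups.setdefault(feature, []).append(target)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(targets, features):
    """Reduction in uncertainty about the targets once the feature is known."""
    return entropy(targets) - conditional_entropy(targets, features)


# Hypothetical toy records: one form field (category) and one outcome (branch).
category = ["A", "A", "B", "B"]
branch = ["cs", "cs", "ee", "ee"]

print("H(branch)            =", entropy(branch))
print("H(branch | category) =", conditional_entropy(branch, category))
print("gain                 =", information_gain(branch, category))
```

In this toy case the field determines the outcome completely, so the gain equals the full entropy of the outcome; a field that is independent of the outcome would give a gain of zero.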

Similar resources

Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets

With the advancement of metagenome data mining, science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, the process of gene selection or feature selection is essential. It follows that redundant genes can be removed, increasing the speed and accuracy of classification. After applying the gene se...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...

Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...

Multimodal medical image fusion based on Yager’s intuitionistic fuzzy sets

The objective of image fusion for medical images is to combine multiple images obtained from various sources into a single image suitable for better diagnosis. Most state-of-the-art image fusion techniques are based on nonfuzzy sets, and the fused image so obtained lacks complementary information. Intuitionistic fuzzy sets (IFS) are determined to be more suitable for civilian, and medi...

Sustainable Energy Planning By A Group Decision Model With Entropy Weighting Method Under Interval-Valued Fuzzy Sets And Possibilistic Statistical Concepts

In this paper, a new interval-valued fuzzy multi-criteria group decision-making model is proposed to evaluate each of the energy plans with sustainable development criteria for proper energy plan selection. The purpose of this study is divided into two parts: first, it is aimed at determining the weights of evaluation criteria for sustainable energy planning and second at rating sustainable ene...

Publication date: 2010